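Course Information
Course title: Practices for Distributed Systems and Cloud Application Development (分散式系統與雲端應用開發實務)
Semester: 111-2
Offered to: College of Management, Graduate Institute of Information Management
Instructor: 莊裕澤
Course number: IM5057
Course identifier: 725 U3380
Class section: (none listed)
Credits: 3.0
Full/half year: Half year
Required/elective: Elective
Time: Thursday, periods 7, 8, 9 (14:20~17:20)
Location: 管二103 (College of Management Building II, Room 103)
Remarks: Restricted to third-year undergraduates and above. Enrollment cap: 70 students.
Course introduction video: (none listed)
Core competencies: No core competency mapping has been established for this course yet.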
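Course Syllabus
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
Course Description

Add-request form: https://forms.gle/aNxS2hKpjK8bSRCs6
(closes 2/23 at 23:59)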

The wave of digitalization driven by emerging technologies such as artificial intelligence, the Internet of Things, big data, and cloud computing is disrupting and reshaping every industry, and "moving services to the cloud" has become a key strategy for digital transformation. Unlike traditional large-scale application development, cloud services emphasize rapid deployment, flexibility, and elastic scaling (elastic & scalable). As a result, concepts such as microservices and containerization have gradually displaced monolithic architecture and virtualization to become the industry mainstream for developing cloud application services.

This course aims to provide students with the fundamental theoretical knowledge and practical skills needed to design and implement distributed systems and to develop cloud application services. The content starts from the basics of distributed systems, including the design of distributed algorithms, logical time, consensus, fault tolerance, P2P, and blockchains; moves on to large distributed file systems and computing frameworks such as GFS, Hadoop, Ceph, Bigtable, and MapReduce, and to large distributed storage systems based on distributed hash tables (DHTs) such as Dynamo and IPFS; then covers middleware and virtualization; and finally turns to the tools commonly used today to develop, deploy, scale, and manage cloud application services, such as Docker containers, Kubernetes, Amazon ECS, and Google Cloud Platform. Guest speakers from industry will also be invited to help teach, covering the use of these tools and sharing their practical development experience, so that the course connects directly to the practical needs of industry.

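Course Objectives
Provide the fundamental theoretical knowledge and practical skills needed for distributed systems and cloud application service development.
Course Requirements
Basic knowledge of networking technology and programming ability in C++ and web services.

Students who are considering this course and want to know whether they have sufficient background can try reading the following two papers. If these papers are too difficult for you, this course is not a good fit (over the semester we will read more than 20 papers of this difficulty):
1. A distributed algorithm for minimum-weight spanning trees, Robert G. Gallager, Pierre A. Humblet, and P. M. Spira, ACM Transactions on Programming Languages and Systems, vol. 5, no. 1, pp. 66–77, Jan. 1983.
2. The Hadoop Distributed File System: Architecture and Design, D. Borthakur, 2007, or the web page version, HDFS Architecture Guide, Apache.

Note: Downloading papers from journal and conference databases such as ACM, IEEE, and Elsevier requires access from the NTU network domain; from off campus, connect through the VPN.

This course does not accept auditors; there is no need to email a request.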
Expected weekly after-class study hours
~8
Office Hours
(none listed)
Assigned Readings
Papers, online materials, and resources are assigned for each unit.
References
1. Distributed Systems: Concepts and Design 5th Ed., G. Coulouris et al., 2011.
2. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems, Martin Kleppmann, 2017, O'Reilly.
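Grading (for reference only)
No.  Item             Percentage  Notes
1.   Lab practices    5%
2.   Midterm project  25%         With the possibility of extra points
3.   Term project     20%         Groups of 3~4 (with end-of-term peer evaluation of each member's participation to prevent free riders)
4.   Midterm exam     22%
5.   Final exam       28%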
Course Schedule
Week  Date  Topic
Week 1
2/23  Introduction: Characteristics of Distributed Systems
Reading:
Ch.1, Distributed Systems: Concepts and Design 5th Ed., G. Coulouris et al., 2011.
Week 2
3/1  (online video) Guest lecturer: Docker Containers, Stefan Hong, CTO & cofounder, Taiwan AI Labs
Ref.:
1. Docker overview
2. Container and Microservice Driven Design for Cloud Infrastructure DevOps, IEEE IC2E 2016. 
Week 3
3/2  Basics of Distributed Systems, Part I: System Models, Time and Logical Clock, Synchronization, Coordination, Mutual Exclusion, Leader Election
Reading:
1. Design and Analysis of Distributed Algorithms, Chap. 3, Election: 8.2. YoYo, Nicola Santoro, 2006.
2. A distributed algorithm for minimum-weight spanning trees, Robert G. Gallager, Pierre A. Humblet, and P. M. Spira, ACM Transactions on Programming Languages and Systems, vol. 5, no. 1, pp. 66–77, Jan. 1983.
An enhanced version by Guy Flysher and Amir Rubinshtein.
3. Ch.2, 14-15, Distributed Systems: Concepts and Design 5th Ed., G. Coulouris et al., 2011.
Week 4
3/9  Basics of Distributed Systems, Part II: Minimum Spanning Tree, Name Services & Directory Services, Transaction Processing and Concurrency Control
Reading:
1. Ch. 13, 16-17, Distributed Systems: Concepts and Design 5th Ed., G. Coulouris et al., 2011.
2. Ch.18, Distributed Systems: Concepts and Design 5th Ed., G. Coulouris et al., 2011.
3. Design and Analysis of Distributed Algorithms, Chap. 3, Election. Nicola Santoro, 2006.
Week 5
3/16  Midterm Project-Chord, Part 1: DHT layer implementation
P2P File Sharing Networks & Distributed Hash Tables (DHTs), Part I:
Reading:
1. Freenet: A Distributed Anonymous Information Storage and Retrieval System. Ian Clarke, et al., Springer 2001.
2. Incentives Build Robustness in BitTorrent, B. Cohen, in Workshop on Economics of Peer-to-Peer Systems, vol. 6, 2003.
3. Chord: a scalable peer-to-peer lookup protocol for internet applications. Ion Stoica, et al., IEEE/ACM Transactions on Networking, Vol. 11, Issue 1, Feb. 2003.
Week 6
3/23  Guest lecturer: Google Cloud Platform, Google Kubernetes Engine (GKE), Browny Lin.
Ref.:
Kubernetes
Week 7
3/30  Midterm Project-Chord, Part 2: Deploying to the cloud (AWS).
Guest lecturer from Amazon AWS: Introduction to Amazon Elastic Container Registry, Cathy Lai
Ref.:
Amazon EC2 
Week 8
4/6  Large Distributed File System, Part I:
Hadoop Distributed File System (HDFS)
The Google File System
Reading:
1. Ch.12, Distributed Systems: Concepts and Design 5th Ed., G. Coulouris et al., 2011.
2. The Hadoop Distributed File System: Architecture and Design, D. Borthakur, 2007, or the web page version, HDFS Architecture Guide, Apache.
3. The Hadoop Distributed File System, K. Shvachko, H. Kuang, S. Radia, R. Chansler, IEEE MSST 2010.
4. The Google File System, Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, ACM SOSP 2003.
Week 9
4/13  Midterm Project-Chord, Part 3: File System over Chord.
Term project team formation
Large Distributed File System, Part II
Ceph & RADOS
Reading:
1. Ceph: a scalable, high-performance distributed file system, S. A. Weil, et al., OSDI 2006.
2. RADOS: a scalable, reliable storage service for petabyte-scale storage clusters, S. A. Weil, et al., PDSW 2007
3. CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data, S. A. Weil, et al., SC 2006.

Large Distributed Storage Systems: Google Bigtable
Reading:
1. Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services, S. Gilbert & N. Lynch, ACM SIGACT News, June 2002.
2. Bigtable: A Distributed Storage System for Structured Data, F. Chang, et al., ACM TOCS 2008.
References:
1. Overview of Cloud Bigtable, Google. 
Week 10
4/20  Midterm exam
Week 11
4/27  P2P File Sharing Networks & Distributed Hash Tables (DHTs), Part II:
Reading:
1. A scalable content-addressable network, Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Shenker, ACM SIGCOMM Computer Communication Review.
2. Kademlia: A Peer-to-Peer Information System Based on the XOR Metric, Petar Maymounkov & David Mazières, IPTPS 2002.
3. OceanStore: an architecture for global-scale persistent storage. J. Kubiatowicz, et al., ACM SIGOPS, 2000.
4. S/Kademlia: A practicable approach towards secure key-based routing. I. Baumgart and S. Mies. International Conference on Parallel and Distributed Systems, 2007.
5. Sloppy Hashing and Self-Organizing Clusters. M. J. Freedman, E. Freudenthal, and D. Mazières, IPTPS 2003.
Week 12
5/4  Large Distributed Storage Systems: Chubby, Amazon Dynamo, MongoDB, IPFS
Reading:
1. The Chubby lock service for loosely-coupled distributed systems. M. Burrows, OSDI 2006.
2. Dynamo: Amazon’s highly available key-value store. Giuseppe DeCandia, et al., ACM SIGOPS Operating Systems Review, October 2007.
3. IPFS - Content Addressed, Versioned, P2P File System (DRAFT 3). Juan Benet, 2014.
3'. IPFS Concepts & Documents. IPFS. 
Week 13
5/11  Term project proposal presentations (5 minutes per group) (originally scheduled for 4/27, postponed to this week)
Consensus, Part 1: Impossibility Results; Part 2: Paxos Algorithm (online, est. 3 hrs of study)
Consensus, Part 3: Byzantine Fault Tolerance (online, est. 1 hr of study)
Reading:
1. Impossibility of distributed consensus with one faulty process. Fischer, M. J.; Lynch, N. A.; Paterson, M. S., Journal of the ACM. 32 (2): 374–382, 1985.
2. Paxos Made Simple. L. Lamport, ACM SIGACT News 32(4), pp. 51-58, 2001.
2'. (alternative) The Part-Time Parliament. L. Lamport, ACM TOCS 16(2), pp. 133-169, 1998.
3. The Byzantine Generals Problem. Lamport, L.; Shostak, R.; Pease, M., ACM Transactions on Programming Languages and Systems. 4 (3): 382–401, 1982.
Week 14
5/18  Blockchains
Reading:
1. Bitcoin: A peer-to-peer electronic cash system. S. Nakamoto, Decentralized Business Review, 2008.
2. Ethereum: A next-generation smart contract and decentralized application platform. Vitalik Buterin, white paper, 2014.
3. Blockchain Consensus Protocols in the Wild. Christian Cachin, Marko Vukolić. arXiv:1707.01873, 2017.
4. Blockchain for decentralization of Internet: prospects, trends, and challenges. J. Zarrin, et al., Cluster Computing, vol. 24, 2021.
Ref.:
1. Ethereum Whitepaper.
2. OmniLedger: A Secure, Scale-Out, Decentralized Ledger via Sharding. E. Kokoris-Kogias, et al. IEEE Symposium on Security and Privacy (SP), 2018.
3. SoK: Consensus in the Age of Blockchains. S. Bano, et al. ACM Conference on Advances in Financial Technologies, 2019.
 
Week 15
5/25  Home study
Week 16
6/1  Final exam
Week 17
6/8  Term Projects Demo